SUMMARY

This research is devoted to investigating how Bayesian statistical
analysis differs from classical statistical analysis in the context of
operational testing. The specific aspects of operational testing which
are considered are the power resulting from a hypothesis test and the
expected loss, or risk, resulting from a decision.

First it is shown that it is quite difficult to develop a meaningful
measure of comparison between Bayesian and classical analysis in the
framework of hypothesis testing. Using the power of the hypothesis test
as a measure of comparison, it is shown that under certain conditions
classical statistical procedures lead to more powerful tests than
Bayesian procedures. It is then shown that Bayesian statistical
procedures are superior to classical procedures in the framework of
minimizing expected loss or risk.

CHAPTER I

INTRODUCTION 

Background 

This study was prompted by the desire of the U.S. Army Operational
Test and Evaluation Agency (OTEA) to compare Bayesian to classical
statistical procedures for determining sample sizes for actual tests
which have been conducted by OTEA. The objective of the comparison is
to determine if smaller sample sizes can be obtained through the use of
Bayesian procedures which yield inferences comparable to those drawn
from classical procedures. To understand the procedures to be utilized
in this study, one must be familiar with the nature of operational
testing as performed by OTEA.

The purpose of operational testing is to provide data upon which to
estimate a prospective system's military utility, operational
effectiveness and suitability, and need for any modifications [2]. This
data is obtained through a sequence of three operational tests (referred
to as OT I, OT II, and OT III). Each test must be completed and analyzed
prior to beginning the next test to determine if there is a need for the
next test in the sequence. When possible the new system is tested
alongside the existing system during each phase of testing to acquire
data from both systems under identical conditions. At the end of each
test, the data is collected and analyzed, and a decision is made to
conduct the next test or to reject the new system [1].

The overall assessment procedure consists of identifying certain
measures of effectiveness (MOE) which are critical to the system under
consideration, such as percent of target hits, mean miss distance, mean
time between failure, and so on. Once identified, these MOE are
incorporated into a test design which will provide for a side-by-side
comparison of the competing systems with respect to each MOE. After all
MOE of interest have been tested, the overall desirability of the system
is then evaluated.

For a given test design, the problem at hand is one of determining the
minimum number of replicates (sample size) required for each set of
experimental conditions to achieve a specified level of confidence in
the inference made as a result of the experiment. This sample size is
currently being determined by classical statistical procedures [18]. As
an example, suppose the random variable of interest is assumed to follow
a normal distribution with unknown mean and variance, and the decision
maker is interested in determining the expected value or mean of the
random variable. In the classical sense, the mean is considered an
unknown constant. The power of the test, or the probability of rejecting
the hypothesized value of the mean when the hypothesized value is
inaccurate, is determined from the operating characteristic curves for
the type of test conducted. The above theory of classical statistics
will be important when compared to the Bayesian theory investigated in
this study.

Objectives  of  Research 

The objectives of this research are twofold. The first objective is to
determine whether or not Bayesian methodology can be effectively applied
to operational testing. As noted earlier, operational testing is
conducted in three phases, and many times the same measures of
effectiveness are examined in more than one phase. The current
procedures used by OTEA consider each test in the sequence
independently; i.e., the inferences made at the end of each test are
based on the data obtained during that specific test only [18]. There is
no attempt made to combine the data on a specific MOE measured in OT I
and OT II, for example, to obtain a better estimate for the MOE from
which better inferences can be made. Chapter III is devoted to
developing a methodology which will apply Bayesian techniques to the
combination of data from two phases of testing to determine the power of
a hypothesis test for any specified sample size.

The  second  objective  of  this  research  is  to  determine  under  what 
conditions  the  Bayesian  methodology  will  produce  a "better"  test  than 
the  classical  methodology  when  considering  the  same  sample  size  for 
both methods. Chapters III and IV are devoted to comparing the above
methodologies in the context of an actual test conducted by OTEA.

Fundamentals  of  Bayesian  Analysis 

The discussion presented here will compare classical statistical theory
to Bayesian statistical theory to demonstrate how OTEA's present
concepts of testing would have to be altered to apply Bayesian
techniques to operational testing. Presently, if OTEA is considering a
data generating process which may be modeled by the normal process with
unknown mean and variance, then the probability density function
associated with the process is the normal density, with mean, $\mu$, and
variance, $\sigma^2$. These parameters would be viewed as unknown
constants by the classical statistician. These constants are generally
estimated by sampling from the data generating process and using the
sample statistics $\bar{X}$ and $s^2$ to estimate $\mu$ and $\sigma^2$,
respectively. If one is interested in $\mu$, the mean of the process,
$\bar{X}$ and $s^2$ could be used to construct a confidence interval on
$\mu$. For example, if $(1-\alpha)$ is the degree of confidence desired,
then [12]

$$1 - \alpha = P\left(\bar{X} - t_{\alpha/2,n-1}\, s/\sqrt{n} \le \mu \le \bar{X} + t_{\alpha/2,n-1}\, s/\sqrt{n}\right), \tag{1-1}$$

where $n$ is the sample size and $t_{\alpha/2,n-1}$ is the percentage
point of the central t-distribution with $n-1$ degrees of freedom such
that $P(t > t_{\alpha/2,n-1}) = \alpha/2$. This confidence interval on
$\mu$ would be interpreted in the relative frequency sense. That is, if
repeated samples of size $n$ were taken, each time computing new values
of $\bar{X}$ and $s^2$, and a confidence interval on $\mu$ was
constructed after each sample was taken, then it would be expected that
$100(1-\alpha)\%$ of the confidence intervals so constructed would
contain the "true" value of $\mu$ [12]. The Bayesian analyst would
differ in several ways. He would consider the unknown parameters,
$\tilde{\mu}$ and $\tilde{\sigma}^2$, as random variables. ("Tildes"
will be used to indicate random variables throughout this study.) Since
point estimates of random variables are useless, he would ascribe to
them a probability distribution instead. If prior sampling information
is not available, the analyst must use his subjective knowledge of the
process to assess a probability distribution for the joint occurrence of
$\tilde{\mu}$ and $\tilde{\sigma}^2$. This prior distribution can then
be combined with sample information to produce new distributions for the
unknown parameters, as will be demonstrated below. The conceptual
differences between the classical and the Bayesian analyst play
important roles in interpreting the results of a test [36].
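
The relative frequency interpretation of the confidence interval in
equation (1-1) can be checked by simulation. The sketch below is
illustrative only: the process mean, standard deviation, and number of
repetitions are hypothetical values chosen for the demonstration, and the
critical value 2.160 is the tabulated upper 2.5 percent point of the t
distribution with 13 degrees of freedom.

```python
import math
import random

# Tabulated t_{0.025, 13}: upper 2.5% point for 13 degrees of freedom.
T_CRIT_13 = 2.160


def confidence_interval(sample, t_crit):
    """Return (lower, upper) limits of the two-sided t interval on the mean."""
    n = len(sample)
    xbar = sum(sample) / n
    s2 = sum((x - xbar) ** 2 for x in sample) / (n - 1)
    half_width = t_crit * math.sqrt(s2 / n)
    return xbar - half_width, xbar + half_width


def coverage(true_mean, true_sd, n, t_crit, repetitions, seed=0):
    """Fraction of repeated-sample intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repetitions):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        lower, upper = confidence_interval(sample, t_crit)
        if lower <= true_mean <= upper:
            hits += 1
    return hits / repetitions


# Hypothetical process: mean 20, standard deviation 45, samples of size 14.
print(coverage(20.0, 45.0, 14, T_CRIT_13, repetitions=4000))
```

The printed fraction should lie close to 0.95, which is the sense in
which the classical analyst attaches 95 percent confidence to any one
interval.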

The combination of a prior probability distribution of a random variable
with sample information is achieved by use of Bayes' theorem. For a
continuous random variable, $\tilde{\theta}$, Bayes' theorem may be
written as

$$f''(\theta \mid y) = \frac{f'(\theta)\, f(y \mid \theta)}{\int_{-\infty}^{\infty} f'(\theta)\, f(y \mid \theta)\, d\theta}, \tag{1-2}$$

where a single prime superscript (') denotes a prior distribution or
parameter, a double prime superscript (") denotes a posterior
distribution or parameter, and no superscript denotes a sampling
distribution or parameter.* Therefore, in equation (1-2), $f'(\theta)$
is the prior distribution of $\tilde{\theta}$ representing the analyst's
beliefs regarding $\tilde{\theta}$ prior to sampling, $f(y \mid \theta)$
represents the likelihood function chosen to describe the sampling
process, and $f''(\theta \mid y)$ is the posterior distribution of
$\tilde{\theta}$ representing the analyst's beliefs regarding
$\tilde{\theta}$ after sampling [36]. The theorem can also be applied to
discrete random variables by substituting probability mass functions for
probability density functions and a summation sign for the integral
sign. Winkler [36] gives a derivation of Bayes' theorem from conditional
probability formulas. In applying Bayes' theorem, the major difficulties
lie in assessing the prior distribution and likelihood function and in
evaluating the integral in the denominator of the formula. Baker [4] has
suggested methods for handling these difficulties which are discussed in
the next chapter and which will be used in this study.

* Appendix 1 presents a detailed explanation of all notation in this
study.
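
The discrete form of Bayes' theorem mentioned above can be sketched in a
few lines. This is a minimal illustration, not part of the methodology
of this study: the three candidate hit probabilities, the uniform prior,
and the observed 8 hits in 10 rounds are all hypothetical.

```python
import math


def binomial_likelihood(hits, shots, p):
    """P(hits successes in shots independent trials | success probability p)."""
    return math.comb(shots, hits) * p ** hits * (1 - p) ** (shots - hits)


def posterior_pmf(prior, likelihood):
    """Discrete Bayes' theorem: normalize prior * likelihood over all values."""
    joint = {theta: prior[theta] * likelihood[theta] for theta in prior}
    total = sum(joint.values())  # the summation that replaces the integral
    return {theta: weight / total for theta, weight in joint.items()}


# Hypothetical example: three candidate hit probabilities, equally likely
# a priori, updated after observing 8 target hits in 10 rounds.
prior = {0.5: 1 / 3, 0.7: 1 / 3, 0.9: 1 / 3}
likelihood = {theta: binomial_likelihood(8, 10, theta) for theta in prior}
posterior = posterior_pmf(prior, likelihood)
for theta, prob in posterior.items():
    print(f"theta = {theta}: prior {prior[theta]:.3f} -> posterior {prob:.3f}")
```

The denominator is a plain sum here, which is why the discrete case
avoids the integration difficulty noted for the continuous case.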


CHAPTER  II 


BAYESIAN  DISTRIBUTION  THEORY 

In his thesis, Baker [4] considered a problem similar to the one
addressed in Chapter I. He has proposed a methodology for combining data
relative to a single MOE taken from one phase of testing with sample
information on the same MOE taken in a later phase of testing. This
procedure produces an estimate of the MOE for use in making decisions.
The methodology applies to an operational test in which a proposed
system is being tested side-by-side with the system it has been designed
to replace, and a single MOE is under consideration.

In general, this methodology uses the theory of selecting a prior
distribution from the natural conjugate family of distributions which,
when combined with the likelihood function in Bayes' theorem, produces a
posterior distribution that will be of the same form as the prior. This
will reduce the computational burden considerably in the sequential
analysis used in this study. (For a complete discussion of natural
conjugate distributions, see Raiffa and Schlaifer [29], Chapter 3.)

In this study, the results of an actual operational test are supplied by
OTEA. When considering a single MOE, OTEA assumes the univariate normal
distribution with unknown mean and variance as the basic model for
sample size determination for both measurement and attribute data [18].
The same function will, therefore, be used in this study as the
likelihood function for the random variable under consideration.

The side-by-side nature of the operational tests under consideration
suggests that inferences be drawn from the difference of performance
characteristics of the systems rather than from the actual performance
characteristics of a single system. Thus, if $\tilde{X}_1$ and
$\tilde{X}_2$ represent the same MOE for systems one and two,
respectively, $\tilde{D} = \tilde{X}_1 - \tilde{X}_2$ will represent the
difference between the MOE of the two systems. Since $\tilde{X}_1$ and
$\tilde{X}_2$ are assumed to follow the normal distribution with unknown
mean and variance, $\tilde{D}$, which is just a linear combination of
two independent, normally distributed random variables, can also be
assumed to follow a normal distribution [12] with unknown mean and
variance, say $\mu$ and $\sigma^2$, respectively. The variable of
interest in this study will be $\mu$, the mean difference between the
two systems.

In the classical sense $\mu$, the mean of the distribution of
$\tilde{D}$, is considered to be an unknown constant, and inferences are
drawn from tests of hypothesized values of $\mu$. Consequently, if $\mu$
can be shown to be equal to zero, one can conclude that there is no
difference between the competing systems, whereas if $\mu$ is not equal
to zero, then one can conclude that one system is better than the other.

In the Bayesian sense, since $\tilde{\mu}$ is considered as a random
variable, tests on whether or not $\tilde{\mu}$ takes on a specific
value are meaningless. One must consider tests where $\tilde{\mu}$ can
take on a range of values, e.g., $\tilde{\mu}$ above or below a
specified value, or one can consider a test on specific values of the
mean of $\tilde{\mu}$. If $\bar{\mu}$ can be shown to be equal to zero,
one could reasonably conclude that there is no difference between
competing systems, and if $\bar{\mu} \ne 0$, there is a difference.

It has been shown [29] that when $\tilde{\mu}$ is considered as a random
variable, the distribution of $\tilde{\mu}$ is the Student's t
distribution, represented by the density

$$f(\mu \mid m, v, n, \nu) = f_S(\mu \mid m, n/v, \nu), \tag{2-1}$$

where $(m, v, n, \nu)$ is the statistic resulting from a sample of size
$n$ and is given by

$$m = \frac{1}{n}\sum_{i=1}^{n} D_i, \qquad v = \frac{1}{\nu}\sum_{i=1}^{n} (D_i - m)^2, \qquad \nu = n - 1. \tag{2-2}$$

The parameters $(m, n/v, \nu)$ in the argument of $f_S$ on the right
side of equation (2-1) indicate the degree of non centrality of the
distribution. The central or standard Student's t distribution would be
given by $f_S(\mu \mid 0, 1, \nu)$. The distribution given in equation
(2-1) can be standardized so that cumulative t tables can be used in
computing probabilities as follows:

$$P(\tilde{\mu} < \mu \mid m, v, n, \nu) = F_{S^*}\left([\mu - m]\sqrt{n/v} \mid \nu\right),$$

where the subscript $S^*$ indicates the standard Student's t
distribution. It has also been shown [29] that the mean and variance of
$\tilde{\mu}$ are given by

$$E(\tilde{\mu} \mid m, v, n, \nu) \equiv \bar{\mu} = m, \quad \nu > 1$$
$$V(\tilde{\mu} \mid m, v, n, \nu) \equiv \check{\mu} = \frac{v}{n}\,\frac{\nu}{\nu - 2}, \quad \nu > 2. \tag{2-3}$$

The objective of this methodology is then to determine the minimum
sample size which will produce a posterior distribution of $\tilde{\mu}$
that will enable the decision maker to achieve a specified level of
confidence in the inference drawn concerning $\tilde{\mu}$.

Since the Department of the Army has imposed on OTEA the requirement
that operational testing be independent of all other testing [2], it has
been assumed that prior to OT I the state of knowledge concerning
$\tilde{\mu}$ can be represented by a diffuse distribution for the
normal-gamma family, as developed in Winkler [36]. Thus, when the prior
distribution is combined with the sample information from OT I the
resulting posterior distribution will also be normal-gamma [36]. When a
measure of effectiveness that was considered in OT I is being
reconsidered in OT II, it must be assumed that the posterior standard
deviation of $\tilde{\mu}$, $\sqrt{\check{\mu}''}$, determined in OT I
was too large to reach a meaningful conclusion about $\tilde{\mu}$. The
sequential nature of the testing then presents the opportunity to use
the posterior distribution determined from OT I regarding $\tilde{\mu}$
as the prior state of knowledge of $\tilde{\mu}$ for OT II. The
methodology now concentrates on developing a sample size for OT II which
will produce a posterior standard deviation for $\tilde{\mu}$ equal to
some fraction of the prior standard deviation; i.e.,
$\sqrt{\check{\mu}''} = s\sqrt{\check{\mu}'}$, where $0 < s \le 1$.

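Baker [4] has shown that a sample of size

$$n = \left(\frac{1}{s^2} - 1\right) n', \quad 0 < s \le 1,$$

where $n'$ represents the sample size of the prior distribution, can be
expected to reduce the prior standard deviation of $\tilde{\mu}$ by a
factor $s$. He approximated $E(\sqrt{\check{\mu}''})$ with

$$E\left(\sqrt{\check{\mu}''}\right) \approx \sqrt{\frac{\check{\mu}'\, n'}{n' + n}}. \tag{2-4}$$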
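
The sample size rule and the approximation in equation (2-4) are simple
enough to check numerically. The sketch below assumes the form of
equation (2-4) used in the worked example of Chapter III, namely the
square root of (prior variance times n') / (n' + n); the function names
and the choice s = 0.5 are illustrative only.

```python
import math


def sample_size_for_reduction(s, n_prior):
    """Sample size expected to shrink the prior standard deviation by factor s."""
    if not 0 < s <= 1:
        raise ValueError("s must satisfy 0 < s <= 1")
    return (1.0 / s ** 2 - 1.0) * n_prior


def expected_posterior_sd(prior_variance, n_prior, n):
    """Approximate expected posterior standard deviation, as in equation (2-4)."""
    return math.sqrt(prior_variance * n_prior / (n_prior + n))


# Illustrative values: prior based on 14 observations with variance 172.25,
# and a desired halving of the standard deviation (s = 0.5).
n_prior, prior_variance, s = 14, 172.25, 0.5
n = sample_size_for_reduction(s, n_prior)
ratio = expected_posterior_sd(prior_variance, n_prior, n) / math.sqrt(prior_variance)
print(n, ratio)
```

In practice the computed n would be rounded up to the next whole
replicate; halving the standard deviation requires three times the prior
sample size.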


Due to the approximations used in his formulation, Baker [4] has
introduced an error into the expected posterior standard deviation which
can be written as


% error  = 1 - exp  [-3/4  ( (-r^)  - (7^7^))]  • (2-5) 

If this error is determined by the decision maker to be too large, then
equation (2-4) cannot be used, and a more complex formula must be used
to determine the sample size, $n$, which will produce a desired expected
posterior standard deviation of $\tilde{\mu}$. This equation is

E[ /r  |m',v',n',v';n,v]  = /(V^/n")5'  exp  -[|-((^)-  ) ],  (2-6) 

where  n"  = n'+n. 

Although equation (2-6) cannot be solved explicitly for $n$, given a
desired value of $E(\sqrt{\check{\mu}''})$, it can be solved
iteratively. Baker has suggested a starting value of $n$ to be that
found by solving equation (2-4) for $n$.

Once a sample size has been determined and a sample has been taken, the
statistic $(m'', v'', n'', \nu'')$ is determined [29] as follows:

$$\begin{aligned}
m'' &= \frac{n'm' + nm}{n' + n} \\
n'' &= n' + n \\
v'' &= \frac{[\nu'v' + n'(m')^2] + [\nu v + nm^2] - n''(m'')^2}{[\nu' + \delta(n')] + [\nu + \delta(n)] - \delta(n'')} \\
\nu'' &= [\nu' + \delta(n')] + [\nu + \delta(n)] - \delta(n''),
\end{aligned} \tag{2-7}$$

where

$$\delta(\gamma) = \begin{cases} 0 & \text{if } \gamma = 0 \\ 1 & \text{if } \gamma > 0. \end{cases}$$

The mean and variance of the posterior distribution of $\tilde{\mu}$ are
then

$$\begin{aligned}
E(\tilde{\mu}'') &\equiv \bar{\mu}'' = m'' \\
V(\tilde{\mu}'') &\equiv \check{\mu}'' = \frac{v''\nu''}{n''(\nu'' - 2)}.
\end{aligned} \tag{2-8}$$

In the case where the prior distribution is diffuse, as in OT I,
$n' = \nu' = 0$, and the posterior parameter $(m'', v'', n'', \nu'')$
equals the sample statistic $(m, v, n, \nu)$ [29].
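
The updating rules in equation (2-7) and the moments in equation (2-8)
can be written as a short routine. This is a sketch under the notation
of this chapter; the names `Statistic`, `update`, and
`posterior_mean_variance` are hypothetical, and the numbers are the OT I
sample statistic quoted in Chapter III.

```python
from typing import NamedTuple


class Statistic(NamedTuple):
    m: float   # mean
    v: float   # variance
    n: float   # sample size
    nu: float  # degrees of freedom


def delta(x):
    """Indicator used in equation (2-7): 0 if x == 0, 1 if x > 0."""
    return 1 if x > 0 else 0


def update(prior, sample):
    """Combine a prior statistic with a sample statistic, per equation (2-7)."""
    n_post = prior.n + sample.n
    m_post = (prior.n * prior.m + sample.n * sample.m) / n_post
    nu_post = (prior.nu + delta(prior.n)) + (sample.nu + delta(sample.n)) - delta(n_post)
    sum_sq = ((prior.nu * prior.v + prior.n * prior.m ** 2)
              + (sample.nu * sample.v + sample.n * sample.m ** 2)
              - n_post * m_post ** 2)
    return Statistic(m_post, sum_sq / nu_post, n_post, nu_post)


def posterior_mean_variance(stat):
    """Mean and variance of the Student posterior, per equation (2-8); needs nu > 2."""
    return stat.m, stat.v * stat.nu / (stat.n * (stat.nu - 2))


# Diffuse prior (n' = nu' = 0) combined with the OT I sample statistic.
diffuse = Statistic(m=0.0, v=0.0, n=0, nu=0)
ot1 = Statistic(m=17.6, v=2040.5, n=14, nu=13)
posterior = update(diffuse, ot1)
print(posterior, posterior_mean_variance(posterior))
```

With the diffuse prior the posterior statistic reproduces the sample
statistic, and the same `update` call can then be reused with this
posterior as the prior for a later test phase.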

The above development is directed at producing a value of the posterior
standard deviation of $\tilde{\mu}$ which will make the distribution of
$\tilde{\mu}$ "tight" enough to enable the decision maker to make his
decision concerning $\tilde{\mu}$ with a specified degree of confidence.
However, the value of $\sqrt{\check{\mu}''}$ which satisfies the above
criterion is subjective in nature. The problem of determining values of
$\sqrt{\check{\mu}''}$ which meet certain criteria will be discussed in
Chapter III.

CHAPTER III

CLASSICAL  VS.  BAYESIAN  HYPOTHESIS  TESTING 

Introduction 

In this chapter an attempt will be made to compare Bayesian and
classical statistical methods in the context of hypothesis testing. One
commonly accepted measure of comparison between methods of testing
hypotheses is the power of each test. We shall define the power of a
test as the probability of rejecting the null hypothesis when it is
false, or, equivalently, the probability of not committing a type II
error. The power of the test is an appropriate measure of comparison for
this study because of the consequences of the decisions resulting when
type II errors are made. In the case of operational testing, consider
the null hypothesis: there is no difference between the standard
equipment and its proposed replacement versus the alternate hypothesis:
the proposed replacement is better than the standard. If the decision
maker makes a type II error (i.e., the new equipment is better but it
will not be purchased), he is denying the army the use of a better piece
of equipment and thereby keeping the level of mission accomplishment
lower than it could be.

In the case of a type I error, however, where the decision maker rejects
the null hypothesis when it is true (i.e., there is no difference in
equipment but the new equipment is purchased), the consequence would be
that a probably more expensive piece of equipment would be purchased
which would not improve the mission accomplishment of the army. A better
piece of equipment would not have been overlooked, however.

In this example, a type II error could be more harmful to the army than
a type I error. For this reason, the probability of not committing a
type II error, or the power of the test, is considered of prime
importance in this study.

The  Two-tailed  Hypothesis  Test 

To compare classical versus Bayesian tests in terms of power, the
hypotheses of interest in both tests must be considered. In the
classical two-tailed test, $H_0: \mu = 0$ vs $H_1: \mu \ne 0$ ($\mu$ is
considered a constant), the type I error can be fixed at any desired
level, and the type II error can be determined for any given sample size
by use of the appropriate operating characteristic curves. However,
since the Bayesian considers $\tilde{\mu}$ to be a continuous random
variable, the probability that $\tilde{\mu} = 0$ will always be zero. In
fact, Winkler [36] has stated that there is no logical Bayesian
equivalent to the classical two-tailed test. Two modified Bayesian
hypotheses will, therefore, be considered in this study. The first tests
whether or not the mean of $\tilde{\mu}$, $\bar{\mu}$, equals zero;
i.e., $H_0: \bar{\mu} = 0$ vs $H_1: \bar{\mu} \ne 0$. Since the variance
of $\tilde{\mu}$ decreases as $n$ increases, an infinite sample would
yield exact knowledge of the true $\mu$. In the infinite sample case,
the mean of $\tilde{\mu}$ would be the exact value of $\mu$ when the
variance of $\tilde{\mu}$ is zero. It is, therefore, logical to compare
the value of $\bar{\mu}$ in the Bayesian test to the value of $\mu$ in
the classical test. This will be done in the next section.

The second modified Bayesian hypothesis tests whether or not
$\tilde{\mu}$ lies in some interval about zero; i.e.,
$H_0: -a \le \tilde{\mu} \le a$ vs $H_1: \tilde{\mu} < -a$ or
$\tilde{\mu} > a$, $a > 0$. Since the classical decision maker is really
more interested in knowing whether $\mu$ is in some small interval about
zero rather than if $\mu$ is exactly equal to zero, this Bayesian
hypothesis would also serve as a valid comparative to the classical
two-tailed test. This comparison will be discussed in connection with
the one-tailed test later in this chapter.

Solution  Using  Bayesian  Prediction  Interval 

In this section, the hypotheses $H_0: \bar{\mu} = 0$ vs
$H_1: \bar{\mu} \ne 0$ in the Bayesian context will be compared with the
hypotheses $H_0: \mu = 0$ vs $H_1: \mu \ne 0$ in the classical context.
The measure of comparison will be the power of the test. As stated in
Chapter II, $\tilde{D}$, the difference between the same MOE of two
competing systems, is assumed to follow a normal distribution with
unknown mean, $\mu$, and unknown variance, $\sigma^2$. In the classical
test, a sample size can be determined which will yield a specified power
for the test for any fixed type I error, $\alpha$. The rejection
criterion for $H_0$, established from the $\alpha$ level desired, is
[12]: reject $H_0$ if $|t_0| > t_{\alpha/2,n-1}$, where

$$t_0 = \text{test statistic} = \frac{m}{\sqrt{v/n}}$$

$$m = \text{sample mean} = \frac{1}{n}\sum_{i=1}^{n} D_i$$

$$v = \text{sample variance} = \frac{1}{n-1}\sum_{i=1}^{n} (D_i - m)^2$$

$$t_{\alpha/2,n-1} = \text{value of } t \text{ such that } P(t > t_{\alpha/2,n-1}) = \alpha/2.$$

The power of the test for various sample sizes and departures of $\mu$
from 0 is given by the appropriate operating characteristic curves for
the 2-tailed t test in [12].
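
The classical rejection rule can be sketched as follows. The data are
hypothetical differences between two systems, invented purely for
illustration; the critical value 2.365 is the tabulated upper 2.5
percent point of the t distribution with 7 degrees of freedom.

```python
import math


def t_statistic(differences):
    """Test statistic t0 = m / sqrt(v / n) for H0: mu = 0."""
    n = len(differences)
    m = sum(differences) / n
    v = sum((d - m) ** 2 for d in differences) / (n - 1)
    return m / math.sqrt(v / n)


def reject_null(differences, t_crit):
    """Two-tailed rule: reject H0 when |t0| exceeds the critical value."""
    return abs(t_statistic(differences)) > t_crit


# Hypothetical paired differences (old system minus new system), n = 8.
differences = [12.0, -5.0, 30.0, 18.0, 7.0, 25.0, -2.0, 15.0]
T_CRIT_7 = 2.365  # tabulated t_{0.025, 7}
print(t_statistic(differences), reject_null(differences, T_CRIT_7))
```

For these invented data the statistic is about 2.89, so the hypothesis
of no difference would be rejected at the 5 percent level.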

Before defining the power of the Bayesian test, some discussion of a
Bayesian prediction interval is needed. A Bayesian prediction interval
(BPI) is an interval having a stated probability, e.g., $(1-\gamma)$, of
containing the variable of interest. In Figure 1, $\bar{\mu}''$ is the
mean of the posterior distribution of $\tilde{\mu}$, $a$ is the lower
prediction limit, $b$ is the upper prediction limit, and the shaded area
is the probability that $a < \tilde{\mu}'' < b$.

Figure 1. Generalized Bayesian Interval on $\tilde{\mu}''$.

If the interval is centered on $\bar{\mu}''$, the length of the
interval, $d''$, is given by [12]

$$d'' = 2\, t_{\gamma/2,\nu''} \sqrt{\check{\mu}''}. \tag{3-1}$$
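
A centered Bayesian prediction interval can be computed directly from
the posterior mean and variance. The sketch below assumes the
half-length of the interval is the t percentage point times the
posterior standard deviation, which is how the interval length is used
in the power derivation of this chapter. The inputs are the OT I
posterior values quoted later in this chapter (mean 17.6, variance
172.25, 13 degrees of freedom), and 2.160 is the tabulated
$t_{.025,13}$.

```python
import math


def prediction_interval(post_mean, post_variance, t_crit):
    """Centered (1 - gamma) interval: mean +/- t_crit * posterior std. dev."""
    half_length = t_crit * math.sqrt(post_variance)
    return post_mean - half_length, post_mean + half_length


def contains_zero(interval):
    """True when zero lies inside the interval, i.e. H0 is not rejected."""
    lower, upper = interval
    return lower <= 0.0 <= upper


# OT I posterior: mean 17.6 sec, variance 172.25 sec^2, 13 degrees of freedom.
bpi = prediction_interval(17.6, 172.25, 2.160)
print(bpi, contains_zero(bpi))
```

Under these assumptions the 95 percent interval straddles zero, which is
consistent with the remark in Chapter II that the OT I posterior
standard deviation was too large to support a conclusion.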


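When considering the Bayesian hypotheses, $H_0: \bar{\mu} = 0$ vs
$H_1: \bar{\mu} \ne 0$, the rejection criterion to be used in this
section will be: reject $H_0$ if zero does not fall in the $(1-\gamma)$
BPI on $\tilde{\mu}''$. The type I error, $\alpha$, is

$$\alpha = P(\text{rejecting } H_0 \mid H_0 \text{ true}),$$

which can be restated as

$$\alpha = P(0 \text{ is not in } (1-\gamma)\text{BPI} \mid \bar{\mu} = 0). \tag{3-2}$$

The power of the test is defined as

$$\text{Power} = P(\text{rejecting } H_0 \mid \bar{\mu} = c \ne 0),$$

which can be restated as

$$\text{Power} = P(0 \text{ is not in } (1-\gamma)\text{BPI} \mid \bar{\mu} = c). \tag{3-3}$$

Using $d''$ from equation (3-1), the power becomes

$$\text{Power} = P(0 \text{ is not in interval } [\bar{\mu}'' - d''/2,\ \bar{\mu}'' + d''/2] \mid \bar{\mu} = c). \tag{3-4}$$

Since $\bar{\mu}'' = m''$ (defined by equation (2-7)), equation (3-4)
becomes

$$\text{Power} = P(0 \text{ is not in interval } [m'' - d''/2,\ m'' + d''/2] \mid \bar{\mu} = c). \tag{3-5}$$

Prior to sampling, $m''$ and $d''$ are random variables, denoted
$\tilde{m}''$ and $\tilde{d}''$, which lead to

$$\text{Power} = P(0 \text{ is not in interval } [\tilde{m}'' - \tilde{d}''/2,\ \tilde{m}'' + \tilde{d}''/2] \mid \bar{\mu} = c). \tag{3-6}$$

Since zero will not be in the BPI only if the end points of the BPI have
the same sign,

$$\begin{aligned}
\text{Power} = {} & P(\tilde{m}'' - \tilde{d}''/2 < 0 \text{ and } \tilde{m}'' + \tilde{d}''/2 < 0 \mid \bar{\mu} = c) \\
& + P(\tilde{m}'' - \tilde{d}''/2 > 0 \text{ and } \tilde{m}'' + \tilde{d}''/2 > 0 \mid \bar{\mu} = c).
\end{aligned} \tag{3-7}$$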
Equation (3-7) is equivalent to

$$\text{Power} = P(|\tilde{m}''| > \tilde{d}''/2 \mid \bar{\mu} = c). \tag{3-8}$$

Substituting the value of $d''$ given in equation (3-1),

$$\text{Power} = P\left(|\tilde{m}''| > t_{\gamma/2,\nu''}\sqrt{\check{\mu}''} \mid \bar{\mu} = c\right). \tag{3-9}$$

Since $\sqrt{\check{\mu}''}$ is always greater than 0,

$$\text{Power} = P\left(|\tilde{m}''| / \sqrt{\check{\mu}''} > t_{\gamma/2,\nu''} \mid \bar{\mu} = c\right). \tag{3-10}$$

It has been shown [29] that $\tilde{m}''$ follows a non central t
distribution and $\check{\mu}''$ follows an inverted beta 1
distribution. It is, therefore, very difficult to calculate the power of
the test from the expression given in equation (3-10). To simplify the
calculations, $\sqrt{\check{\mu}''}$ will be replaced by its expected
value, as given in equation (2-4), and the resulting power computation
is considered to be an approximation to the power in equation (3-10).
After replacing $\sqrt{\check{\mu}''}$ with its expected value as given
in equation (2-4) and letting
$k = t_{\gamma/2,\nu''}\, E(\sqrt{\check{\mu}''})$, equation (3-9)
becomes

$$\text{Power} = P(|\tilde{m}''| > k \mid \bar{\mu} = c) \tag{3-11}$$

$$\text{Power} = P(\tilde{m}'' > k \mid \bar{\mu} = c) + P(\tilde{m}'' < -k \mid \bar{\mu} = c). \tag{3-12}$$

Using $m''$ as given in equation (2-7), equation (3-12) becomes

$$\text{Power} = P\left(\frac{m'n' + n\tilde{m}}{n''} > k \mid \bar{\mu} = c\right) + P\left(\frac{m'n' + n\tilde{m}}{n''} < -k \mid \bar{\mu} = c\right), \tag{3-13}$$

or, equivalently,

$$\text{Power} = P\left(\tilde{m} > \frac{kn'' - m'n'}{n} \mid \bar{\mu} = c\right) + P\left(\tilde{m} < \frac{-kn'' - m'n'}{n} \mid \bar{\mu} = c\right) \tag{3-14}$$

$$= 1 - P\left(\tilde{m} < \frac{kn'' - m'n'}{n} \mid \bar{\mu} = c\right) + P\left(\tilde{m} < \frac{-kn'' - m'n'}{n} \mid \bar{\mu} = c\right). \tag{3-15}$$


It has been shown [29] that the distribution of $\tilde{m}$ is given by

$$D(m \mid m', v', n', \nu'; n, \nu) = f_S(m \mid m', n_u/v', \nu'), \tag{3-16}$$

where

$$n_u = \frac{n'n}{n' + n}.$$

Strictly speaking, the Bayesian analyst does not consider $\bar{\mu}$ to
be an unknown constant with some true value. Rather, he considers
$\bar{\mu}$ to be a random variable also. However, in order to use
Bayesian procedures to formulate a test which can be compared to the
classical hypothesis test, it has been assumed that there is some true
value for $\bar{\mu}$. With this assumption, the expected value of the
sample mean, $E(\tilde{m})$, would then be equal to $\bar{\mu}$.
Cumulative probabilities for $\tilde{m}$ would then be computed as given
below.

$$P(\tilde{m} < a \mid \bar{\mu}, n_u, v', \nu') = F_{S^*}\left((a - \bar{\mu})\sqrt{n_u/v'} \mid \nu'\right). \tag{3-17}$$

Using equation (3-17) with $\bar{\mu} = c$, equation (3-15) can be
rewritten as

$$\text{Power} = 1 - F_{S^*}\left(\frac{kn'' - m'n' - cn}{n\sqrt{v'/n_u}} \mid \nu'\right) + F_{S^*}\left(\frac{-kn'' - m'n' - cn}{n\sqrt{v'/n_u}} \mid \nu'\right). \tag{3-18}$$
Summarizing the method for determining the power of the Bayesian test
for a given sample size and a prior statistic $(m', v', n', \nu')$:

1. Calculate $E(\sqrt{\check{\mu}''})$ from equation (2-4):

   $$E\left(\sqrt{\check{\mu}''}\right) \approx \sqrt{\frac{\check{\mu}'\, n'}{n' + n}}.$$

2. Calculate $\nu''$ and $n_u$ from equations (2-7) and (3-16):

   $$\nu'' = n' + n - 1, \qquad n_u = \frac{n'n}{n' + n}.$$

3. Calculate $k = t_{\gamma/2,\nu''}\, E(\sqrt{\check{\mu}''})$.

4. Calculate power from equation (3-18) for any value of $c$.
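
The four steps above can be sketched as a single routine. This is an
illustrative implementation under stated assumptions, not the
computation used in the original study: it takes equation (2-4) as the
square root of (prior variance times n') / (n' + n), takes the power
expression in the form of equation (3-18), and replaces table lookups
with a numerically integrated Student t distribution (Simpson's rule and
bisection). All function names are hypothetical.

```python
import math


def t_pdf(x, nu):
    """Density of the standard Student t distribution with nu degrees of freedom."""
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - (nu + 1) / 2 * math.log1p(x * x / nu))


def t_cdf(x, nu, steps=2000):
    """Cumulative t probability via Simpson's rule on [0, |x|] plus symmetry."""
    if x == 0:
        return 0.5
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, nu) + t_pdf(a, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area


def t_upper_point(p, nu):
    """Value t with P(T > t) = p, found by bisection on the cdf."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if 1 - t_cdf(mid, nu) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def bayesian_power(c, m_prior, v_prior, n_prior, nu_prior, n, gamma):
    """Approximate power of the BPI-based test when the true mean equals c."""
    n_post = n_prior + n
    prior_var = v_prior * nu_prior / (n_prior * (nu_prior - 2))  # equation (2-3)
    exp_sd = math.sqrt(prior_var * n_prior / n_post)             # step 1
    nu_post = n_prior + n - 1                                    # step 2
    n_u = n_prior * n / (n_prior + n)                            # step 2
    k = t_upper_point(gamma / 2, nu_post) * exp_sd               # step 3
    scale = n * math.sqrt(v_prior / n_u)
    upper = (k * n_post - m_prior * n_prior - c * n) / scale     # step 4
    lower = (-k * n_post - m_prior * n_prior - c * n) / scale
    return 1 - t_cdf(upper, nu_prior) + t_cdf(lower, nu_prior)


# Prior statistic and sample size taken from the LWCMS illustration.
for c in (0, 20, 40):
    print(c, round(bayesian_power(c, 17.6, 2040.5, 14, 13, 30, 0.05), 3))
```

Setting c = 0 gives the type I error of the test. Because the routine
computes the t percentage point and cumulative probabilities itself,
its output may differ somewhat from hand calculations made with rounded
table values.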


Illustrating the Procedure

In this section, the solution procedure described above will be
illustrated in the context of an actual operational test conducted by
OTEA. The test selected was an OT II for the Lightweight Company Mortar
System (LWCMS), which is being considered as a replacement for the 81 mm
mortar currently being used by the army. The purpose of the test was to
provide data for a side-by-side comparison of the two mortars to assess
the relative operational performance and military utility of the LWCMS
[20]. One of the MOE which was considered in both OT I and OT II was the
time required for an individual to complete the gunner's examination,
which is a test designed to determine how quickly an individual can
perform critical operations in preparing a mortar to fire. In OT I a
sample of size 14 was used to determine the distribution of times to
perform the gunner's test. The results of this test are contained in
Appendix 2. If $\tilde{X}_1$ and $\tilde{X}_2$ represent the times to
perform the test on the old system and the new system, respectively,
then $\tilde{D} = \tilde{X}_1 - \tilde{X}_2$ is the variable described
in Chapter II. The mean of $\tilde{D}$, $\tilde{\mu}$, is the variable
of interest in this study.

Using a diffuse prior distribution, Baker [4] determined the parameters
of the posterior distribution of $\tilde{\mu}$ for OT I from equation
(2-7) to be

$$m'' = m = 17.6 \text{ sec}$$
$$n'' = n = 14$$
$$v'' = 2040.5 \text{ sec}^2$$
$$\nu'' = 13$$

Since the same MOE was also tested in OT II, the above values will be used in the prior distribution of $\mu$ for OT II. The value of the prior variance of $\mu$ for OT II is computed from equation (2-3):

$$\check{\mu}' = \frac{v'}{n'}\cdot\frac{\nu'}{\nu'-2} = \frac{(2040.5)(13)}{(14)(11)} = 172.25\ \text{sec}^2$$
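The prior variance above can be checked with a few lines of Python. This is a minimal sketch that assumes equation (2-3) is the usual variance of a Student-distributed mean, $(v'/n')\,\nu'/(\nu'-2)$; the function name `student_variance` is illustrative only.

```python
def student_variance(v, n, nu):
    """Variance of a Student-distributed mean with scale v/n and nu degrees
    of freedom (assumed form of equation (2-3); finite only for nu > 2)."""
    if nu <= 2:
        raise ValueError("variance is finite only for nu > 2")
    return (v / n) * nu / (nu - 2)

# Prior parameters for OT II, taken from the OT I posterior quoted in the text.
v_prior, n_prior, nu_prior = 2040.5, 14, 13
prior_var = student_variance(v_prior, n_prior, nu_prior)
print(round(prior_var, 2))  # 172.25
```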


In OT II, OTEA used a sample of 30 individuals, each of whom performed the gunner's test twice on each of the competing systems. The average times for each individual on each system are given in Appendix 3. The power curve for the classical two-tailed test with n = 30 will be compared to the power curve for the Bayesian test with n = 30. The step-by-step procedure for calculating the power using the statistic $(m', v', n', \nu') = (17.6, 2040.5, 14, 13)$, n = 30, and a 95% Bayesian prediction interval ($\gamma = .05$) is given below.


1.  $E(\sqrt{\check{\mu}''}) = \sqrt{\dfrac{(172.25)(14)}{30+14}} = 7.40$ sec

2.  $\nu'' = 14 + 30 - 1 = 43$

    $n_u = \dfrac{30(14)}{30+14} = 9.55$

3.  $k = t_{.025,43}\,E(\sqrt{\check{\mu}''}) = 2.02(7.40) = 14.95$

4.  $$\text{Power} = 1 - F_{S*}\!\left(\left.\frac{(14.95)(44) - (17.6)(14) - 30c}{30\sqrt{2040.5/9.55}}\,\right|\,13\right) + F_{S*}\!\left(\left.\frac{(-14.95)(44) - (17.6)(14) - 30c}{30\sqrt{2040.5/9.55}}\,\right|\,13\right)$$
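Steps 1 through 3 can be reproduced as a short script. This is a sketch under stated assumptions: the helper names are hypothetical, the prior variance 172.25 is carried over from the text, and the tabled value $t_{.025,43} = 2.02$ is quoted from the text rather than computed.

```python
import math

def expected_posterior_sd(prior_var, n_prior, n):
    """Step 1 as used in the text: sqrt(prior variance * n' / (n + n'))."""
    return math.sqrt(prior_var * n_prior / (n + n_prior))

def effective_n(n, n_prior):
    """Step 2: n_u = n * n' / (n + n')."""
    return n * n_prior / (n + n_prior)

prior_var, n_prior, n = 172.25, 14, 30
T_025_43 = 2.02  # tabled t value quoted in the text, not computed here

sd = expected_posterior_sd(prior_var, n_prior, n)
nu_post = n_prior + n - 1          # degrees of freedom used in step 2
n_u = effective_n(n, n_prior)
k = T_025_43 * sd                  # step 3

print(round(sd, 2), nu_post, round(n_u, 2), round(k, 2))
```

The printed values should agree with the 7.40, 43, 9.55, and 14.95 of the worked steps to two decimals.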


The cumulative distribution for the standard Student's t distribution is given in Biometrika Tables for Statisticians, Volume 1, by Pearson and Hartley [22]. The power for c = 20 is calculated to be

$$\text{Power} = 1 - F_{S*}(-.2\,|\,13) + F_{S*}(-3.9\,|\,13) = 1 - .42 + .0009 = .58$$

The power for other values of c is calculated in a similar manner. Since the value of $\gamma$ in the $(1-\gamma)$ BPI is not the type I error for this test, the type I error must also be calculated for each value of n. The type I error, $\alpha$, is given by

$$\alpha = P(\text{rejecting } H_0 \mid \mu = 0) = P(0 \text{ is not in the } (1-\gamma) \text{ BPI} \mid \mu = 0).$$

This formula is the same as the formula for the power with $\mu = c = 0$. Therefore,

$$\alpha = 1 - F_{S*}\!\left(\left.\frac{(14.95)(44) - (17.6)(14)}{30\sqrt{2040.5/9.55}}\,\right|\,13\right) + F_{S*}\!\left(\left.\frac{(-14.95)(44) - (17.6)(14)}{30\sqrt{2040.5/9.55}}\,\right|\,13\right)$$

$$= 1 - F_{S*}(.94\,|\,13) + F_{S*}(-2.1\,|\,13)$$

$$= 1 - (.82) + .03 = .21$$


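In order to fix $\alpha$ at a certain level, as is done in the classical case, the width of the $(1-\gamma)$ BPI must be changed with each value of n; i.e., $\gamma$ must change with each n to keep $\alpha$ fixed. To calculate the value of $\gamma$ which produces a given $\alpha$, consider eq. (3-18) with c = 0.

$$\alpha = 1 - F_{S*}\!\left(\left.\frac{kn'' - m'n'}{n\sqrt{v'/n_u}}\,\right|\,\nu'\right) + F_{S*}\!\left(\left.\frac{-kn'' - m'n'}{n\sqrt{v'/n_u}}\,\right|\,\nu'\right) \tag{3-19}$$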


For positive m', the last term in equation (3-19) is insignificant. Therefore, letting $\alpha = .05$ and dropping the last term yields

$$.95 = F_{S*}\!\left(\left.\frac{kn'' - m'n'}{n\sqrt{v'/n_u}}\,\right|\,13\right). \tag{3-20}$$

The value of the argument in the right side of equation (3-20) which yields a probability of .95 is 1.8 [22]. Thus

$$\frac{kn'' - m'n'}{n\sqrt{v'/n_u}} = 1.8 .$$

Equivalently,

$$k = \frac{1.8\,n\sqrt{v'/n_u} + m'n'}{n''} . \tag{3-21}$$

For n = 30,

$$k = \frac{1.8(30)\sqrt{2040.5/9.55} + (17.6)(14)}{30 + 14} = 23.54 .$$

Since $k = t_{\gamma/2,\,\nu''}\,E(\sqrt{\check{\mu}''})$ by definition,

$$t_{\gamma/2,\,43} = \frac{k}{E(\sqrt{\check{\mu}''})} = \frac{23.54}{7.40} = 3.18 .$$

Thus, $\gamma \approx .005$, or a 99.5% BPI will produce a type I error of .05. Table 1 lists the values of k needed to produce $\alpha = .05$ for various values of n.


Table 1.  Sample Size versus k, $\alpha$ = .05

Sample Size        k
     2           23.08
     4           23.93
     7           24.28
    10           24.30
    15           24.13
    20           23.91
    30           23.54
    40           23.27
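The entries of Table 1 follow from equation (3-21). The sketch below recomputes them from the prior parameters in the text; the function name `k_for_fixed_alpha` is hypothetical, and the value 1.8 is the tabled argument quoted above, not a computed quantile.

```python
import math

def k_for_fixed_alpha(n, m_prior=17.6, v_prior=2040.5, n_prior=14, t_arg=1.8):
    """Equation (3-21): k = (t_arg * n * sqrt(v'/n_u) + m' n') / n''."""
    n_post = n + n_prior
    n_u = n * n_prior / n_post
    return (t_arg * n * math.sqrt(v_prior / n_u) + m_prior * n_prior) / n_post

for n in (2, 4, 7, 10, 15, 20, 30, 40):
    print(n, round(k_for_fixed_alpha(n), 2))
```

Computed values may differ from the table in the last digit because the text rounds intermediate quantities such as $n_u$.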

With the type I error fixed at .05, the power curves for the Bayesian and classical tests for n = 30 and n = 4 have been plotted in Figures 2 and 3, respectively. From the figures it can be seen that for n = 4, the Bayesian test is slightly more powerful but for n = 30 the classical test is much more powerful. Plots of the power versus sample size for each test for |c| = 20 and |c| = 40 are shown in Figures 4 and 5, respectively. There it is evident that the classical test is superior to the Bayesian test in detecting both small and large values of |c|, particularly when large values for the power are required.

We will now investigate the behavior of the power curves if $\gamma$ in the $(1-\gamma)$ BPI is held constant at $\gamma = .05$. In this case, the type I error will not remain fixed as it did in the previous calculations. The type I error can be computed from equation (3-19) for various sample sizes. The results of the calculations are given in Table 2.


Table 2.  Type I Error Versus Sample Size

Sample Size    Type I Error
     2             .01
     4             .04
     7             .08
    10             .11
    15             .14
    20             .16
    30             .21
    40             .25


Plots of the power versus sample size for the two tests for |c| = 20, 30, and 40 are given in Figures 6, 7, and 8, respectively. There it can be seen that as |c| increases, the difference between the two curves decreases. However, as seen in Table 2, the type I error for the Bayesian test is greater than that for the classical test (.05) for sample sizes greater than four. Once again, the classical test appears to be superior, particularly when high values of the power are required.

In  the  foregoing  example,  a 95%  BPI  was  utilized  in  computing 
the  power  for  the  Bayesian  test.  If  a larger  interval  is  used,  both 
the  power  and  the  type  I error  will  decrease.  This  is  obvious  from  equa- 
tions (3-3)  and  (3-2).  If  the  length  of  the  BPI  is  increased,  the 
probability  that  the  BPI  will  include  zero  must  increase.  Therefore, 
the  probability  that  the  BPI  will  not  include  zero  (or  power)  must 
decrease.  Similarly,  decreasing  the  length  of  the  BPI  will  increase  the 
power  and  the  type  I error.  Thus,  various  power  and  type  I error  com- 
binations can  be  achieved  by  varying  the  width  of  the  BPI. 

In the above example, the variability of m was affected by $n_u$, as defined in equation (3-16). In Table 3 below, the difference between n and $n_u$ can be seen to increase as n increases. The parameter, $n_u$, takes into effect the variability of m' in calculating the variability of m, which is given by [29]

$$V(m \mid m', v', n', \nu') = \frac{v'}{n_u}\cdot\frac{\nu'}{\nu'-2} , \tag{3-22}$$

where $n_u = \dfrac{nn'}{n+n'}$.

[Figure: Power vs. Sample Size; figure number and |c| value illegible in source]

Table 3.  Sample Size Versus $n_u$

Sample Size      $n_u$
     2            1.75
     4            3.11
     7            4.67
    10            5.83
    20            8.24
    30            9.55
    40           10.37
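Table 3 can be regenerated directly from the definition $n_u = nn'/(n+n')$ with n' = 14. The snippet is only a sketch; the name `effective_n` is illustrative.

```python
def effective_n(n, n_prior=14):
    """n_u = n * n' / (n + n'), which accounts for the variability of m'."""
    return n * n_prior / (n + n_prior)

for n in (2, 4, 7, 10, 20, 30, 40):
    print(n, round(effective_n(n), 2))
```

Because $n_u$ is always smaller than both n and n', it can never exceed 14 here, which is why the gap between n and $n_u$ widens as n grows.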

Since we are considering the true value of $\mu$ to be a constant, which is the expected value of m, we shall next investigate how the power of the Bayesian test is affected if the variability of m' is not considered in the variance of m; i.e., $n_u$ will be replaced by n in equation (3-18).

Solution Using Alternate Method

Replacing $n_u$ with n in equation (3-18) yields

$$\text{Power} = 1 - F_{S*}\!\left(\left.\frac{kn'' - m'n' - cn}{n\sqrt{v'/n}}\,\right|\,\nu'\right) + F_{S*}\!\left(\left.\frac{-kn'' - m'n' - cn}{n\sqrt{v'/n}}\,\right|\,\nu'\right) \tag{3-23}$$

The type I error for this test is obtained from equation (3-23) with c = 0.

$$\alpha = 1 - F_{S*}\!\left(\left.\frac{kn'' - m'n'}{n\sqrt{v'/n}}\,\right|\,\nu'\right) + F_{S*}\!\left(\left.\frac{-kn'' - m'n'}{n\sqrt{v'/n}}\,\right|\,\nu'\right) \tag{3-24}$$

Illustrating the Procedure

Using the same sample data as in the previous sections, the values of k required to keep $\alpha = .05$ are obtained from equation (3-21) with $n_u = n$ and are shown below.

Table 4.  Sample Size Versus k, $\alpha$ = .05

Sample Size        k
     2           22.59
     4           22.72
     7           21.98
    10           20.98
    15           19.36
    20           17.94
    30           15.72
    40           14.09
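With $n_u$ replaced by n, equation (3-21) simplifies to $k = (1.8\sqrt{n v'} + m'n')/n''$, since $n\sqrt{v'/n} = \sqrt{n v'}$. The sketch below uses that simplified form to recompute Table 4; the function name is hypothetical and 1.8 is again the tabled value quoted in the text.

```python
import math

def k_alternate(n, m_prior=17.6, v_prior=2040.5, n_prior=14, t_arg=1.8):
    """Equation (3-21) with n_u = n: k = (t_arg * sqrt(n v') + m' n') / n''."""
    return (t_arg * math.sqrt(n * v_prior) + m_prior * n_prior) / (n + n_prior)

for n in (2, 4, 7, 10, 15, 20, 30, 40):
    print(n, round(k_alternate(n), 2))
```

Unlike Table 1, these values fall steadily once n exceeds about 4, because the numerator grows only like $\sqrt{n}$ while $n''$ grows linearly.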

With  the  type  I error  fixed  at  .05,  the  power  curves  for  n = 30  and 
n = 4 for  |c|  = 20  are  plotted  in  Figures  9 and  10,  respectively.  It 
can  be  seen  that  for  n = 4 the  Bayesian  test  is  more  powerful  and  for 
n = 30,  the  classical  test  is  marginally  more  powerful.  The  plots  of 
power  versus  sample  size  for  |c|  = 20  and  |c|  = 40  are  given  in  Figures 
11  and  12,  respectively.  From  these  curves,  it  can  be  seen  that  there 
is  little  difference  between  the  two  tests  in  terms  of  power.  Thus,  when 
the  variability  of  m'  is  not  considered  in  the  variability  of  m,  there 
is no significant difference in the power of the two tests. There has been no evidence so far to justify using Bayesian instead of classical procedures in the case of the two-tailed hypothesis test. Another method of treating the two-tailed test will be discussed in connection with the one-tailed test in the next section.

[Figure 12. Power vs. Sample Size, |c| = 40]

The  One-Tailed  Hypothesis  Test 

If the decision maker is interested in the classical one-tailed test, $H_0: \mu \le 0$ vs $H_1: \mu > 0$, there is an equivalent Bayesian test; namely, $H_0: \mu \le 0$ vs $H_1: \mu > 0$. In fact, an alternate method for testing the two-tailed hypothesis also falls into this category. Rather than test $H_0: \mu = 0$ vs $H_1: \mu \ne 0$, consider $H_0: -a \le \mu \le a$ vs $H_1: \mu < -a$ or $\mu > a$, $a > 0$. This really tests whether $\mu$ is in some interval about zero and can be treated as a special case of the one-tailed test discussed below.

As in the two-tailed test, the type I and type II errors for the classical one-tailed test can be determined for any distribution of the random variable of interest. However, in the Bayesian test, once a posterior distribution for $\mu$ has been determined, the probabilities of $H_0$ and $H_1$ being true can be determined; i.e.,

$$P(\mu \le 0 \mid \text{sample data}) = \int_{-\infty}^{0} f(\mu)\,d\mu .$$

If the density function of $\mu$ is known, the above integral can be computed. Additionally,

$$P(\mu > 0 \mid \text{sample data}) = 1 - P(\mu \le 0 \mid \text{sample data}).$$

Equivalently, [36]

$$P(H_1 \text{ is true}) = 1 - P(H_0 \text{ is true}).$$

If one considers

$$\alpha = P(\text{rejecting } H_0 \mid H_0 \text{ is true}),$$

and one knows the probability that $H_0$ is true, it is difficult to justify any rejection criteria for $H_0$ which would lead to a meaningful calculation of $\alpha$. Winkler [36] suggests that the significance level of the test can be determined by measuring how "unusual" the sample result obtained is, given that the null hypothesis is true. Equivalently, one could determine the chance of obtaining a sample result more "extreme" than the one observed, given $H_0$ is true. In the test considered in the previous section, for example, if $\mu = 0$, how "unusual" is the sample result of m = 72.95 sec? (See data in Appendix 3.) The standardized value corresponding to m = 72.95 is

$$t_0 = \frac{m - 0}{s/\sqrt{n}} = \frac{72.95}{38.96/\sqrt{30}} = 10.26 .$$

Since $H_1$ is one-tailed to the right, the significance level is equal to $P(t_{29} \ge 10.26)$, which is less than .0001. The smaller the significance level, the less likely the sample result is, given that $H_0$ is true [36]. It can be seen, then, that the significance level as defined above cannot be fixed as in the classical test since it depends on the sample result. Additionally, there is no clear method for determining a power for the Bayesian test. As stated earlier in this section, the modified two-tailed test can be considered a special case of the one-tailed test. If the hypotheses of interest are $H_0: -a \le \mu \le a$ and $H_1: \mu < -a$ or $\mu > a$, $a > 0$, then the probability that $H_0$ is true is

$$P(H_0 \text{ is true} \mid \text{sample data}) = \int_{-a}^{a} f(\mu)\,d\mu .$$

When the posterior distribution of $\mu$ is determined, the above integral can be computed. Obviously,

$$P(H_1 \text{ is true} \mid \text{sample data}) = 1 - P(H_0 \text{ is true} \mid \text{sample data}).$$

The  arguments  given  for  determining  the  significance  level  and  power 
for  the  one-tailed  test  apply  as  well  for  the  modified  two-tailed  test. 

Since  there  is  no  meaningful  definition  of  power  available  for 
the  Bayesian  one-tailed  test,  it  is  necessary  to  determine  a different 
measure  of  comparison  between  the  classical  and  Bayesian  statistical 
procedures.  The  concept  of  minimum  loss  will,  therefore,  be  considered 
in  Chapter  IV  as  the  basis  for  comparison. 


CHAPTER IV

CLASSICAL  VS.  BAYESIAN  ANALYSIS  WITH  LINEAR  LOSS  FUNCTIONS 

Introduction 

In  this  chapter,  a linear  loss  function  will  be  utilized  to 
compare  the  consequences  of  the  decisions  made  under  Bayesian  and 
classical  analyses  of  the  same  problem.  In  all  real  world  problems, 
there  are  certain  payoffs  or  losses  associated  with  decisions  made 
under  uncertainty.  When  the  decision  maker  is  not  sure  of  the  value 
of a certain quantity, such as $\mu$ in the analysis in the last chapter, he is subject to making a decision which is based on the assumption of the wrong value of $\mu$. For example, if the null hypothesis, $H_0: \mu \le 0$, were accepted, causing the decision maker to "reject" the new equipment, when in fact the true $\mu$ is greater than 0, a certain "opportunity" loss
is  experienced.  The  army  would  be  penalized,  in  that  it  would  not  have 
the  opportunity  to  use  a better  piece  of  equipment.  Even  though  it  is 
not  always  possible  to  attach  a monetary  figure  to  the  opportunity  loss, 
some  type  of  loss  function  must  be  considered  by  the  classical  decision 
maker,  at  least  subjectively.  When  the  decision  maker  determines  maxi- 
mum acceptable  levels  for  the  type  I and  type  II  errors  for  a test,  he 
is  indicating  the  relative  importance  of  each  type  of  error.  For  exam- 
ple, if  .05  and  .10  are  the  maximum  levels  for  the  type  I and  type  II 
errors,  respectively,  the  decision  maker  could  be  indicating  that  he 
considers the loss associated with a type I error to be twice as great as the loss associated with a type II error. In the classical analysis
of  a problem,  however,  a decision  is  based  on  the  outcome  of  a hypothe- 
sis test  on  some  central  MOE,  rather  than  on  the  possible  losses  result- 
ing from  each  possible  decision.  Many  times  the  type  I error  is 
arbitrarily  set  as  some  low  value,  say  .05  or  .01,  and  the  power  of  the 
test  is  made  as  high  as  necessary  by  increasing  the  sample  size.  How- 
ever, in  considering  actual  loss  functions  formally,  the  decisions 
resulting  from  the  classical  and  Bayesian  approaches  to  the  problem  may 
differ  considerably.  The  linear  loss  function  will  be  considered  in 
this  chapter. 

Linear  Payoff  Function 

Before considering the linear loss function, a brief discussion of the linear payoff function is needed. In considering the two action problem of concern in this study, let $a_1$ denote the action of rejecting the new equipment in favor of the old, and let $a_2$ denote the action of purchasing the new equipment. Define linear payoff functions as in [36], say

$$R(a_1, \mu) = r_1 + s_1\mu \tag{4-1}$$

$$R(a_2, \mu) = r_2 + s_2\mu$$

where $r_i$ and $s_i$ are constants and $s_2 > s_1$.

With these functions, the decision maker would consider the payoff of a certain action linear with respect to the actual state of the world, $\mu$. In this case action $a_1$ would be optimal if

$$E[R(a_1)] > E[R(a_2)] \tag{4-2}$$

$$E(r_1 + s_1\mu) > E(r_2 + s_2\mu)$$

$$r_1 + s_1E(\mu) > r_2 + s_2E(\mu).$$

Subtracting $r_2$ and $s_1E(\mu)$ from both sides we get

$$r_1 - r_2 > E(\mu)(s_2 - s_1).$$

Since $s_2 > s_1$, dividing by $s_2 - s_1$ gives

$$\frac{r_1 - r_2}{s_2 - s_1} > E(\mu). \tag{4-3}$$

Therefore, if equation (4-3) is satisfied, action $a_1$ is optimal. If the inequality is reversed, action $a_2$ is optimal. For this decision making problem, $\mu_b$ is called the breakeven value of $\mu$:

$$\mu_b = \frac{r_1 - r_2}{s_2 - s_1} \tag{4-4}$$

Figure 13 displays $\mu_b$ pictorially.

If the expected value of $\mu$ is less than $\mu_b$, action $a_1$ is optimal; if it is greater than $\mu_b$, action $a_2$ is optimal; if it is equal to $\mu_b$, the payoffs are equal, and the decision maker should be indifferent toward each action.

[Figure 13. Payoff vs. $\mu$]

Loss Function

If action $a_1$ is chosen and the true value of $\mu$ is really greater than $\mu_b$, then an opportunity loss has been suffered by not having chosen $a_2$, and is given by

$$R(a_2, \mu) - R(a_1, \mu) \tag{4-5}$$

$$= (r_2 + s_2\mu) - (r_1 + s_1\mu)$$

$$= (r_2 - r_1) + (s_2 - s_1)\mu. \tag{4-6}$$

On the other hand, if action $a_2$ were chosen and the true value of $\mu$ is less than $\mu_b$, then the opportunity loss is

$$R(a_1, \mu) - R(a_2, \mu) \tag{4-7}$$

$$= (r_1 + s_1\mu) - (r_2 + s_2\mu)$$

$$= (r_1 - r_2) + (s_1 - s_2)\mu. \tag{4-8}$$

If $a_1$ were chosen and the true value of $\mu$ is less than $\mu_b$, then the opportunity loss would be 0. If $a_2$ were chosen and the true value of $\mu$ is greater than $\mu_b$, the opportunity loss is also 0. The loss functions for $a_1$ and $a_2$ are summarized below:

$$L(a_1, \mu) = \begin{cases} 0 & \text{if } \mu \le \mu_b \\ (r_2 - r_1) + (s_2 - s_1)\mu & \text{if } \mu > \mu_b \end{cases} \tag{4-9}$$

$$L(a_2, \mu) = \begin{cases} (r_1 - r_2) + (s_1 - s_2)\mu & \text{if } \mu \le \mu_b \\ 0 & \text{if } \mu > \mu_b \end{cases} \tag{4-10}$$


The relationship between the payoff and loss functions is shown in Figure 14.

The loss functions, $L(a_1, \mu)$ and $L(a_2, \mu)$, are shown in Figures 15 and 16. It is obvious from these figures that the loss functions are related to the value of the breakeven point as given in equation (4-4). If $\mu > \mu_b$,

$$(r_2 - r_1) + (s_2 - s_1)\mu = (s_2 - s_1)(-\mu_b) + (s_2 - s_1)\mu = (s_2 - s_1)(\mu - \mu_b).$$

Similarly, for $\mu \le \mu_b$,

$$(r_1 - r_2) + (s_1 - s_2)\mu = (s_2 - s_1)(\mu_b - \mu).$$

Therefore, the loss functions can now be written

$$L(a_1, \mu) = \begin{cases} 0 & \mu \le \mu_b \\ (s_2 - s_1)(\mu - \mu_b) & \mu > \mu_b \end{cases} \tag{4-11}$$

$$L(a_2, \mu) = \begin{cases} (s_2 - s_1)(\mu_b - \mu) & \mu \le \mu_b \\ 0 & \mu > \mu_b \end{cases} \tag{4-12}$$


The expected value of each loss function depends on the distribution of $\mu$ and is given by [36]

$$EL(a_1) = (s_2 - s_1)\int_{\mu_b}^{\infty}(\mu - \mu_b)f(\mu)\,d\mu \tag{4-13}$$

$$EL(a_2) = (s_2 - s_1)\int_{-\infty}^{\mu_b}(\mu_b - \mu)f(\mu)\,d\mu \tag{4-14}$$

The integrals in equations (4-13) and (4-14) are called right hand
and  left  hand  linear  loss  integrals,  respectively.  Formulas  for 
tabulation  of  the  above  integrals  are  given  in  [36]  for  various  conju- 
gate distributions.  The  loss  functions  given  in  equations  (4-11)  and 
(4-12)  are  valid  for  both  the  classical  and  Bayesian  analyses.  The 
difference  in  the  two  approaches  arises  from  the  differing  decision 
criteria  in  each  analysis. 

Comparison  of  Decisions 

In the classical analysis, action $a_1$ (reject new equipment) is taken if the null hypothesis, $H_0: \mu \le 0$, is accepted, while action $a_2$ (purchase new equipment) is taken if the null hypothesis is rejected. No formal consideration is given to the value of $\mu_b$ or to the loss function. In the Bayesian analysis, however, action $a_1$ is taken if the expected loss due to $a_1$ is less than the expected loss due to $a_2$, and $a_2$ is taken if the expected loss due to $a_2$ is less than that due to $a_1$; i.e., expected loss is minimized [36]. Consider Figure 17, where two typical linear loss functions are graphed.

In the case where the classical analyst accepted the null hypothesis, resulting in action $a_1$, if $\mu_b < 0$, then the loss given by $L(a_1, \mu)$ would still be incurred for values of $\mu$ between 0 and $\mu_b$, even though $H_0$ is true. If $H_0$ were actually false, and the true $\mu$ is greater than 0 (i.e., a type II error), then the losses are even greater. If, however, $\mu_b > 0$, then a loss is incurred by choosing $a_1$ only if the true $\mu$ is greater than $\mu_b$. This would also be a type II error. Thus, the classical analyst may incur a loss $L(a_1, \mu)$ by accepting $H_0$, if he has made a type II error or no error at all, in terms of hypothesis testing.

Similarly, if the classical analyst rejected the null hypothesis, he would choose action $a_2$. If $\mu_b < 0$, the loss given by $L(a_2, \mu)$ would be incurred if the true value of $\mu$ is less than $\mu_b$ (a type I error). If $\mu_b > 0$, then a loss given by $L(a_2, \mu)$ is incurred for values of $\mu$ between 0 and $\mu_b$ even though he correctly rejected $H_0$. Thus, a loss given by $L(a_2, \mu)$ may be incurred by making a type I error or no error at all.

The  above  discussion  points  out  that  by  not  considering  the  break- 
even point  or  loss  function  in  his  analysis,  the  classical  analyst  is 
very  likely  to  incur  higher  losses,  even  when  he  chooses  the  hypothesis 
which  is  true,  than  the  Bayesian  who  chooses  the  action  with  the  least 
expected  loss. 

The LWCMS OT II problem will again be used to demonstrate the above procedures.

Illustrating the Procedure

Consider the payoff functions given by

$$R(a_1, \mu) = -100 - 20\mu$$

$$R(a_2, \mu) = -250 + 10\mu$$

A reasonable explanation of such payoff functions could be as follows. Action $a_1$ corresponds to rejecting the new equipment. If testing the equipment costs 100 units and the decision maker considers a penalty cost of 20 units for each unit of $\mu$ above 0, he would be expressing the importance he attaches to the actual mean difference, $\mu$, between the MOE of the competing systems. As $\mu$ becomes more positive, the new piece of equipment becomes much better than the old and the more costly (negative payoff) becomes the decision of having chosen action $a_1$.

Action $a_2$ indicates that the new system has been chosen. The cost of sampling plus purchase is equal to 250 units, and the decision maker attaches a payoff of 10 units per unit of $\mu$.

Using equation (4-4),

$$\mu_b = \frac{r_1 - r_2}{s_2 - s_1} = \frac{-100 - (-250)}{10 - (-20)} = \frac{150}{30} = 5 \tag{4-15}$$

From equations (4-11) and (4-12)

$$L(a_1, \mu) = \begin{cases} 0 & \mu \le 5 \\ 30(\mu - 5) & \mu > 5 \end{cases} \tag{4-16}$$

$$L(a_2, \mu) = \begin{cases} 30(5 - \mu) & \mu \le 5 \\ 0 & \mu > 5 \end{cases} \tag{4-17}$$
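The breakeven value and the two opportunity-loss functions translate directly into code. The following is a minimal sketch of equations (4-4), (4-11), and (4-12) using the payoff constants of this example; the function names are illustrative only.

```python
def breakeven(r1, s1, r2, s2):
    """Equation (4-4): mu_b = (r1 - r2) / (s2 - s1), valid when s2 > s1."""
    if s2 <= s1:
        raise ValueError("the derivation assumes s2 > s1")
    return (r1 - r2) / (s2 - s1)

def loss_a1(mu, mu_b, s1, s2):
    """Equation (4-11): loss from rejecting the new equipment."""
    return (s2 - s1) * max(mu - mu_b, 0.0)

def loss_a2(mu, mu_b, s1, s2):
    """Equation (4-12): loss from purchasing the new equipment."""
    return (s2 - s1) * max(mu_b - mu, 0.0)

r1, s1, r2, s2 = -100, -20, -250, 10
mu_b = breakeven(r1, s1, r2, s2)
print(mu_b)                        # 5.0
print(loss_a1(8, mu_b, s1, s2))    # 90.0, equal to R(a2, 8) - R(a1, 8)
print(loss_a2(8, mu_b, s1, s2))    # 0.0
```

Each nonzero loss equals the payoff forgone by not taking the other action, which is why only the slope difference $s_2 - s_1$ and the breakeven point appear.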


From equations (4-13) and (4-14)

$$EL(a_1) = 30\int_{5}^{\infty}(\mu - 5)f(\mu)\,d\mu \tag{4-18}$$

$$EL(a_2) = 30\int_{-\infty}^{5}(5 - \mu)f(\mu)\,d\mu \tag{4-19}$$

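It has been shown [29] that if $\mu$ follows the Student density, as it does in this example, then

$$\int_{\mu}^{\infty}(z - \mu)\,f_S(z \mid m, n/v, \nu)\,dz = L_{S*}(t \mid \nu)\sqrt{v/n} \tag{4-20}$$

and

$$\int_{-\infty}^{\mu}(\mu - z)\,f_S(z \mid m, n/v, \nu)\,dz = L_{S*}(-t \mid \nu)\sqrt{v/n} , \tag{4-21}$$

where

$$t = (\mu - m)\sqrt{n/v}$$

$$L_{S*}(t \mid \nu) = \frac{\nu + t^2}{\nu - 1}\,f_{S*}(t \mid \nu) - t\,G_{S*}(t \mid \nu)$$

$$G_{S*}(t \mid \nu) = 1 - F_{S*}(t \mid \nu).$$

Values of $f_{S*}(t \mid \nu)$ are given in [29] Table I.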
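The closed form for $L_{S*}$ can be checked numerically without tables. The sketch below is not the tabular method of [29]: it evaluates the Student density from its standard formula, obtains $G_{S*}$ by Simpson's rule, and compares the closed form with direct integration of $(z - t) f_{S*}(z \mid \nu)$. All function names and the integration cut-off `SPAN` are assumptions made for illustration.

```python
import math

def t_pdf(t, nu):
    """Density f_S*(t|nu) of the standardized Student's t distribution."""
    log_norm = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                - 0.5 * math.log(nu * math.pi))
    return math.exp(log_norm - (nu + 1) / 2 * math.log1p(t * t / nu))

def simpson(func, a, b, steps=20000):
    """Composite Simpson's rule on [a, b] with an even number of steps."""
    h = (b - a) / steps
    total = func(a) + func(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

SPAN = 200.0  # assumed cut-off; the tail beyond t + SPAN is negligible for nu = 43

def upper_tail(t, nu):
    """G_S*(t|nu) = 1 - F_S*(t|nu), by numerical integration."""
    return simpson(lambda z: t_pdf(z, nu), t, t + SPAN)

def linear_loss(t, nu):
    """Closed form: (nu + t^2)/(nu - 1) * f_S*(t|nu) - t * G_S*(t|nu)."""
    return (nu + t * t) / (nu - 1) * t_pdf(t, nu) - t * upper_tail(t, nu)

def linear_loss_direct(t, nu):
    """Direct integral of (z - t) * f_S*(z|nu) from t upward."""
    return simpson(lambda z: (z - t) * t_pdf(z, nu), t, t + SPAN)

left = linear_loss(1.21, 43)
right = linear_loss(-1.21, 43)
print(round(right - left, 4))  # 1.21
```

The printed difference illustrates the identity $L_{S*}(-t \mid \nu) = L_{S*}(t \mid \nu) + t$, which follows from the symmetry of the density and gives a quick sanity check on any hand calculation of the two loss integrals.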

The expected losses given in equations (4-18) and (4-19) could be computed from either the prior or posterior distributions for $\mu$. Since the decision will be made in the classical case after the sample has been taken, the posterior distribution will be used.

As given in Chapter III the prior distribution of $\mu$ before testing in OT II has parameters

$$(m', v', n', \nu') = (17.6,\ 2040.5,\ 14,\ 13).$$

The sample data given in Appendix 3 for OT II produced the statistic $(m, v, n, \nu) = (72.95,\ 1517.9,\ 30,\ 29)$. Thus the parameters of the posterior distribution of $\mu$, as given by equation (2-7), are

$$m'' = \frac{n'm' + nm}{n' + n} = \frac{(14)(17.6) + (30)(72.95)}{14 + 30} = 55.34$$

$$n'' = n + n' = 30 + 14 = 44$$

$$v'' = \frac{[\nu'v' + n'(m')^2] + [\nu v + nm^2] - n''(m'')^2}{[\nu' + \delta(n')] + [\nu + \delta(n)] - \delta(n'')}$$

$$= \frac{(13)(2040.5) + (14)(17.6)^2 + (29)(1517.9) + (30)(72.95)^2 - (44)(55.34)^2}{13 + 1 + 29 + 1 - 1} = 2320.5$$

$$\nu'' = [\nu' + \delta(n')] + [\nu + \delta(n)] - \delta(n'') = 13 + 1 + 29 + 1 - 1 = 43$$

where

$$\delta(x) = \begin{cases} 0 & x = 0 \\ 1 & x > 0 \end{cases}$$

Thus $(m'', v'', n'', \nu'') = (55.34,\ 2320.5,\ 44,\ 43)$.
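The posterior update can be written as one small function. This sketch follows the four formulas quoted above from equation (2-7); the names `delta` and `posterior` are illustrative, and the code carries full precision rather than the rounded $m''$ used in the hand calculation.

```python
def delta(x):
    """Indicator used in the degrees-of-freedom bookkeeping: 0 if x == 0, else 1."""
    return 1 if x > 0 else 0

def posterior(m1, v1, n1, nu1, m, v, n, nu):
    """Combine prior (m', v', n', nu') with sample (m, v, n, nu)."""
    n2 = n1 + n
    m2 = (n1 * m1 + n * m) / n2
    nu2 = (nu1 + delta(n1)) + (nu + delta(n)) - delta(n2)
    v2 = ((nu1 * v1 + n1 * m1 ** 2) + (nu * v + n * m ** 2) - n2 * m2 ** 2) / nu2
    return m2, v2, n2, nu2

m2, v2, n2, nu2 = posterior(17.6, 2040.5, 14, 13, 72.95, 1517.9, 30, 29)
print(round(m2, 2), round(v2, 1), n2, nu2)
```

Because $m''$ is not rounded before squaring, the computed $v''$ can differ from the 2320.5 above by a few tenths; the other three parameters agree exactly.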

To evaluate the linear loss integrals in equations (4-18) and (4-19),

$$t = (\mu_b - m'')\sqrt{n''/v''} = (5 - 55.34)\sqrt{44/2320.5} = -6.9$$

$$G_{S*}(t \mid \nu'') = 1 - F_{S*}(t \mid \nu'') = 1 - F_{S*}(-6.9 \mid 43) = 1$$

$$G_{S*}(-t \mid \nu'') = 1 - F_{S*}(-t \mid \nu'') = 1 - F_{S*}(6.9 \mid 43) = 0$$

$$L_{S*}(t \mid \nu'') = \frac{\nu'' + t^2}{\nu'' - 1}\,f_{S*}(t \mid \nu'') - t\,G_{S*}(t \mid \nu'')$$

$$L_{S*}(-6.9 \mid 43) = \frac{43 + (6.9)^2}{42}\,f_{S*}(-6.9 \mid 43) - (-6.9)\,G_{S*}(-6.9 \mid 43)$$

$$= (2.16)(.00000015) + (6.9)(1) = 6.9 \tag{4-22}$$

$$L_{S*}(-t \mid \nu'') = L_{S*}(6.9 \mid 43) = \frac{43 + (6.9)^2}{42}\,f_{S*}(6.9 \mid 43) - 6.9\,G_{S*}(6.9 \mid 43)$$

$$= (2.16)(.00000015) - 6.9(0)$$

$$L_{S*}(6.9 \mid 43) \approx 0 \tag{4-23}$$

Using the $L_{S*}$ calculated in (4-22) and (4-23),

$$\int_{5}^{\infty}(\mu - 5)f(\mu)\,d\mu = L_{S*}(-6.9 \mid 43)\sqrt{v''/n''} = 6.9\sqrt{2320.5/44} = 50.1$$

$$\int_{-\infty}^{5}(5 - \mu)f(\mu)\,d\mu = L_{S*}(6.9 \mid 43)\sqrt{v''/n''} = 0$$

$$EL(a_1) = 30(50.1) = 1503$$

$$EL(a_2) = 30(0) = 0$$

The Bayesian would, therefore, choose action $a_2$ and buy the new equipment.

In the classical analysis, using $H_0: \mu \le 0$ vs $H_1: \mu > 0$, the statistic $t_0 = \dfrac{\bar{X} - 0}{s/\sqrt{n}}$ would be computed and the null hypothesis would be rejected if $t_0 > t_{\alpha,\,n-1}$ [12]. In this example,

$$t_0 = \frac{72.95}{38.96/\sqrt{30}} = 10.26$$

$$t_{.05,29} = 1.697$$
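The classical decision rule is a one-line computation. The sketch below evaluates the statistic for the OT II data; the function name is hypothetical, and the critical value 1.697 is the one quoted in the text rather than a computed quantile.

```python
import math

def t_statistic(xbar, s, n, mu0=0.0):
    """One-sample t statistic: (xbar - mu0) / (s / sqrt(n))."""
    return (xbar - mu0) / (s / math.sqrt(n))

T_CRIT = 1.697  # critical value quoted in the text

t0 = t_statistic(72.95, 38.96, 30)
reject_h0 = t0 > T_CRIT
print(round(t0, 2), reject_h0)  # 10.26 True
```

Note that the rule depends only on the sample mean, the sample standard deviation, and n; neither the breakeven value nor the loss slopes enter the decision.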


Therefore, the classical analyst would reject the null hypothesis and also choose action $a_2$. Since the data for this particular problem has a mean so much greater than 0, one should expect both methods to reach the same decision. A better comparison would result from a sample with a mean closer to zero. Consider the case where the sample results in a mean of $\bar{X} = 10$, with the same sample variance. Now, the classical analyst would not reject $H_0$ since

$$t_0 = \frac{\bar{X} - 0}{s/\sqrt{n}} = \frac{10}{38.96/\sqrt{30}} = 1.41,$$

which is less than $t_{.05,29} = 1.697$. The classical analyst would then choose action $a_1$ and reject the new equipment. On the other hand, the Bayesian would recompute $EL(a_1)$ and $EL(a_2)$. The new parameters of the posterior distribution of $\mu$ are

$$m'' = \frac{(14)(17.6) + (30)(10)}{44} = 12.42$$

$$n'' = 44$$

$$v'' = \frac{(13)(2040.5) + (14)(17.6)^2 + (29)(1517.9) + (30)(10)^2 - (44)(12.42)^2}{43} = 1653.37$$

$$\nu'' = 43$$

The t value in equations (4-18) and (4-19) is

$$t = (5 - 12.42)\sqrt{44/1653.37} = -1.21$$

$$G_{S*}(-1.21 \mid 43) = 1 - F_{S*}(-1.21 \mid 43) = .8814$$

$$G_{S*}(1.21 \mid 43) = 1 - F_{S*}(1.21 \mid 43) = .1186$$

$$L_{S*}(-1.21 \mid 43) = \frac{43 + (1.21)^2}{42}\,f_{S*}(-1.21 \mid 43) - (-1.21)\,G_{S*}(-1.21 \mid 43)$$

$$= 1.059(.19) + 1.21(.8814) = 1.268$$

$$L_{S*}(1.21 \mid 43) = \frac{43 + (1.21)^2}{42}\,f_{S*}(1.21 \mid 43) - (1.21)\,G_{S*}(1.21 \mid 43)$$

$$= (1.059)(.19) - 1.21(.1186) = .058$$

Thus, from equations (4-18) and (4-19),

$$EL(a_1) = 30\int_{5}^{\infty}(\mu - 5)f(\mu)\,d\mu = 30\,L_{S*}(-1.21 \mid 43)\sqrt{1653.37/44} = 30(1.268)(6.13) = 233.2$$

$$EL(a_2) = 30\int_{-\infty}^{5}(5 - \mu)f(\mu)\,d\mu = 30\,L_{S*}(1.21 \mid 43)\sqrt{1653.37/44} = 30(.058)(6.13) = 10.67$$

Since $EL(a_2) < EL(a_1)$, the Bayesian would choose action $a_2$ and buy the new equipment. In this example, the classical analyst chose the decision which had the higher expected loss. This resulted from considering only the true value of $\mu$ and not the effect of the value of $\mu$ on the loss which could be incurred from each decision.

Although  this  example  considers  only  the  linear  loss  function, 
the  conclusions  resulting  from  the  example  are  valid  for  all  loss  func- 
tions. Since  the  decision  maker  is  ultimately  concerned  with  choosing 
the  action  which  will  minimize  his  losses  (or  maximize  his  payoffs),  it 
is  imperative  for  him  to  formally  assess  his  loss  or  payoff  function. 

Once  this  is  done,  he  can  base  his  decision  on  the  action  which  has  the 
least  expected  loss  or  greatest  expected  payoff,  rather  than  on  the  true 
value  of  some  statistic. 

It can be seen from equations (4-18) through (4-21) that there is a relationship between the sample size and the expected loss from each action. The sample size affects both the degrees of freedom, $\nu$, and the value of t, as well as the values of the integrals in equations (4-20) and (4-21). It is possible that a sample size could be determined which would minimize the expected loss of each action, but such a determination is beyond the scope of this study.

CHAPTER V

CONCLUSIONS AND RECOMMENDATIONS

Conclusions

The conclusions of this study must be considered from two distinct viewpoints. The first is that of hypothesis testing. If the decision maker is interested purely in testing one hypothesis against another, such as $H_0: \mu = 0$ vs $H_1: \mu \ne 0$, there are several disadvantages to utilizing Bayesian statistical procedures.

The  hypotheses  of  interest  may  not  be  meaningful  from  a Bayesian 
viewpoint,  particularly  for  the  two-tailed  test.  In  fact,  to  utilize 
Bayesian  statistical  procedures,  the  decision  maker  must  alter  his  con- 
ception of  the  mean  and  variance  of  a distribution  of  a random  variable 
as  discussed  in  Chapter  I.  With  the  Bayesian  conception  of  a random 
variable  in  mind,  the  decision  maker  must  formulate  a new  hypothesis  to 
be  tested  which  he  feels  will  provide  him  with  information  equivalent  to 
that  which  he  would  have  obtained  from  the  classical  hypothesis  test. 

An  example  of  this  was  given  in  Chapter  III  with  p = 0 vs  H.|  : p f 0. 
Once  the  alternate  hypotheses  have  been  formulated,  they  can  be  tested 
using  Bayesian  statistical  procedures.  However,  it  was  shown  in  Chapter 
III  that  when  the  probability  of  a type  I error  was  held  constant,  the 
Bayesian  test  was  less  powerful  than  the  classical  in  the  meaningful 
range  of  values  for  the  power.  When  the  BPI  was  kept  constant,  the 
Bayesian  test  was  also  less  powerful  than  the  classical  test  for  large 


(H) 


v.ilui's  ot  t fu'  powi'f  with  till'  aitdifioti.ll  d i sadv.int  .uk'  th.it  t ho  (iroh.i- 
liility  ot  .1  typo  1 ornir  MuriMsod  with  t ho  sample'  si’o.  Wlu'ii  tho  vari- 
al'ilify  ot  till'  s.iiiiplo  moan  was  .issumod  to  ho  indopoiulont  ot  t lio  v.ifi.i- 
I'ility  ot  tho  prior  moan,  it  w.is  shown  th.it  t horo  is  tittlo  dittoroiuo 
lu'twoon  tho  two  ty(H";  ot  tosts  in  torms  ot  powor. 

In the case of the one-tailed test, there are nearly equivalent
hypotheses which can be investigated with Bayesian and classical procedures;
e.g., a one-sided hypothesis about $\mu$ relative to 0 in each case.
Although the probability of a type I error can be determined,
it cannot be fixed in the Bayesian test. Also, the power of the Bayesian
test cannot be meaningfully defined, as discussed in Chapter III. Therefore,
the two types of procedures cannot be meaningfully compared for
the one-tailed test.

The second viewpoint from which the conclusions must be considered
is that of the decision criteria. If the decision maker can formally
describe the loss function in relation to each of the possible decisions
he may make, Bayesian statistical procedures have been developed which
will enable him to make the decision which has the least expected loss.
In Chapter IV an example was provided to demonstrate the procedures in
the case of a linear loss function. Since the classical decision maker
does not formally consider a loss function and bases his decision on the
result of a hypothesis test, he may make a decision which would not minimize
his expected loss. From this viewpoint, therefore, Bayesian statistical
procedures are far superior to classical statistical procedures.
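To make the least-expected-loss criterion concrete, the following Python is a rough sketch of a two-action problem with linear loss. It assumes a normal posterior for $\mu$, whereas Chapter IV works with a Student posterior, so the numbers it produces are illustrative only. The action labels `a1` and `a2`, the cost slope `k`, and both function names are hypothetical.

```python
from statistics import NormalDist


def expected_linear_losses(m, s, mu_b, k=1.0):
    """Return (EL(a1), EL(a2)) for a normal posterior N(m, s^2) on mu.

    a1 is the action that is correct when mu > mu_b,
    a2 is the action that is correct when mu < mu_b.
    A wrong action costs k * |mu - mu_b|; a correct action costs 0.
    """
    z = NormalDist()
    u = (m - mu_b) / s
    el_a1 = k * s * (z.pdf(u) - u * (1 - z.cdf(u)))  # k * E[max(mu_b - mu, 0)]
    el_a2 = k * s * (z.pdf(u) + u * z.cdf(u))        # k * E[max(mu - mu_b, 0)]
    return el_a1, el_a2


def best_action(m, s, mu_b, k=1.0):
    """Pick the action with the smaller expected loss."""
    el_a1, el_a2 = expected_linear_losses(m, s, mu_b, k)
    return "a1" if el_a1 <= el_a2 else "a2"


# Hypothetical posterior mean 20, posterior standard deviation 15, breakeven 0.
print(expected_linear_losses(20.0, 15.0, 0.0, k=2.0))
print(best_action(20.0, 15.0, 0.0, k=2.0))
```

Under these assumptions the two expected losses differ by exactly `k * (m - mu_b)`, so the preferred action depends only on which side of the breakeven value the posterior mean falls, while the posterior spread sets the size of the expected loss.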

Therefore, if the decision maker is interested purely in testing
one hypothesis against another, he should use classical statistical
procedures. However, if he is interested in making a decision which
has the least expected loss, he should use Bayesian statistical
procedures.

Recommendations

In  Chapter  II  it  was  stated  that  one  of  the  objectives  of  the 
Bayesian  methodology  was  to  determine  the  minimum  sample  size  from  which 
meaningful probability statements could be made regarding $\mu$. In this
study  an  attempt  was  made  to  determine  the  sample  size  which  would  pro- 
duce a desired  power.  It  is  recommended  that  some  other  measure  of  a 
"meaningful  probability  statement"  be  investigated  to  reduce  the  sample 
size  now  being  used  by  OTEA. 
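For reference, the classical calculation that a "sample size which would produce a desired power" usually refers to can be sketched as follows. This is an illustration under the assumption of a two-sided z-test with known standard deviation, using the standard approximation $n = ((z_{1-\alpha/2} + z_{1-\beta})\,\sigma/\delta)^2$; it is not the procedure of this study. The function name and the example inputs are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_power(delta, sigma, power=0.9, alpha=0.05):
    """Approximate n for a two-sided z-test to detect a mean shift delta.

    Uses the usual normal approximation, which ignores the negligible
    rejection probability in the tail opposite to the shift.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return ceil(((z_alpha + z_beta) * sigma / delta) ** 2)


# Hypothetical inputs: detect a shift of one standard deviation, then half of one.
print(sample_size_for_power(delta=1.0, sigma=1.0))
print(sample_size_for_power(delta=0.5, sigma=1.0))
```

Halving the detectable shift roughly quadruples the required sample size, which is why the choice of what counts as a meaningful difference dominates the sample size.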

It  is  also  recommended  that  the  Bayesian  methodology  presented  in 
Chapter  IV  be  investigated  to  determine  the  effect  of  sample  size  on  the 
decision  to  be  made. 

Finally, it is recommended that Bayesian statistical procedures be
applied to a problem in which more than one MOE is under investigation
since  the  procedures  in  this  study  apply  to  a situation  in  which  only 
one  MOE  is  being  considered. 
APPENDIX I


EXPLANATION OF NOTATION

Chapter I

$\mu$                          mean of normal density function
$\sigma^2$                     variance of normal density function
$\bar{x}$                      sample mean
$s^2$                          sample variance
$f'(\theta)$                   prior distribution of $\theta$
$f(y \mid \theta)$             likelihood function for $y$ given $\theta$
$f''(\theta \mid y)$           posterior distribution of $\theta$

Chapter II

$f_S(\mu \mid m, n/v, \nu)$    density function for Student's t-distribution
$m', v', n', \nu'$             prior parameters for Student's t-density function
                               (these are interpreted on page 10)
$m'', v'', n'', \nu''$         posterior parameters for Student's t-density function
                               (these are defined mathematically on page 9)
$m, v, n, \nu$                 parameters of a normal sampling distribution
                               (these are defined mathematically on page 11)
$F_{S^*}(\cdot \mid \nu)$      left tail cumulative distribution function for standard
                               Student's density function with $\nu$ degrees of freedom
[illegible]                    expected value of [remainder missing]
[illegible]                    prior variance of $\mu$
[illegible]                    prior standard deviation of $\mu$
[illegible]                    prior mean of $\mu$
[illegible]                    posterior standard deviation of $\mu$
[illegible]                    ratio of expected posterior standard deviation of $\mu$
                               to prior standard deviation of $\mu$
[illegible]                    posterior variance of $\mu$
[illegible]                    posterior mean value of $\mu$

Chapter III

[illegible]                    ratio with denominator $n + n'$ (numerator partly illegible)
$\alpha$                       type I error
$\beta$                        type II error
[illegible]                    test statistic for classical hypothesis test
[illegible]                    length of a $(1 - \gamma)$ Bayesian prediction interval
                               on the posterior distribution of $\mu$

Chapter IV

$R(a_i, \mu)$                  payoff function of the decision, $a_i$, and the true
                               value of $\mu$
$\mu_b$                        breakdown value of $\mu$
$L(a_i, \mu)$                  loss function of the decision, $a_i$, and the true
                               value of $\mu$
$EL(a_i)$                      expected loss if action $a_i$ is chosen
$G_{S^*}(\cdot \mid \nu)$      right tail cumulative distribution function for the
                               standard Student's density function with $\nu$ degrees
                               of freedom
$L_{S^*}(\cdot \mid \nu)$      partial evaluation of linear loss integral for standardized
                               Student density function with $\nu$ degrees of freedom
APPENDIX II


LIGHTWEIGHT COMPANY MORTAR SYSTEM OT I TEST DATA

Gunner's Examination Times [19]

                       System
Test            81 mm      LWCMS      Difference in
Participant     (sec)      (sec)      Performance (sec)

 1              358.0      303.4        54.6
 2              367.0      350.8        16.2
 3              299.0      330.0       -31.0
 4              261.0      147.5       113.5
 5              380.0      313.0        67.0
 6              226.8      250.0       -23.2
 7              272.0      247.0        25.0
 8              239.8      273.0       -33.2
 9              235.0      258.0       -23.0
10              247.5      244.8         2.7
11              279.1      242.7        36.4
12              303.0      234.2        68.8
13              240.9      250.7        -9.8
14              279.0      296.9       -17.9
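As a small worked example on the OT I data, the Python below computes the sample mean and sample standard deviation of the tabulated differences (81 mm time minus LWCMS time) and the corresponding paired t statistic for a null hypothesis of zero mean difference. It is only a sketch: the variable names are hypothetical, and it is an assumption that this statistic matches the one used in the body of the study.

```python
from math import sqrt
from statistics import mean, stdev

# "Difference in Performance" column of the OT I table, in seconds.
diffs_ot1 = [54.6, 16.2, -31.0, 113.5, 67.0, -23.2, 25.0,
             -33.2, -23.0, 2.7, 36.4, 68.8, -9.8, -17.9]

n = len(diffs_ot1)
d_bar = mean(diffs_ot1)            # sample mean of the differences
s_d = stdev(diffs_ot1)             # sample standard deviation (n - 1 divisor)
t_stat = d_bar / (s_d / sqrt(n))   # paired t statistic for H0: mu = 0

print(f"n = {n}, mean = {d_bar:.2f} sec, s = {s_d:.2f} sec, t = {t_stat:.2f}")
```

The statistic would be referred to a Student distribution with `n - 1` degrees of freedom; the standard library has no t-distribution, so the tail probability is left out of the sketch.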
APPENDIX III

LIGHTWEIGHT COMPANY MORTAR SYSTEM OT II TEST DATA
Gunner's Examination Times [20]

                       Systems
Test            81 mm      LWCMS      Difference in
Participant     (sec)      (sec)      Performance (sec)

 1              321.5      225.5        96.0
 2              310.0      194.5       115.5
 3              314.0      248.0        66.0
 4              293.0      272.5        20.5
 5              304.5      259.0        45.5
 6              256.0      173.0        83.0
 7              321.5      224.0        97.5
 8              397.5      256.0       141.5
 9              297.5      282.0        15.5
10              254.5      220.0        34.5
11              258.0      262.0        -4.0
12              294.5      177.5       117.0
13              279.0      255.0        24.0
14              316.0      186.0       130.0
15              288.0      216.0        72.0
16              317.5      204.5       113.0
17              325.0      245.0        80.0
18              326.0      289.5        36.5
19              321.5      269.5        52.0
20              308.5      205.5       103.0
21              311.5      211.0       100.5
22              322.0      213.5       108.5
23              297.0      200.0        97.0
24              316.0      272.5        43.5
25              261.0      208.5        52.5
26              335.0      208.5       126.5
27              274.5      243.5        31.5
28              270.0      200.0        70.0
29              342.5      257.5        85.0
30              314.5      280.5        34.5
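As a hedged illustration of how the two data sets could be combined in a Bayesian analysis, the sketch below treats the OT I differences as prior information and the OT II differences as the sample. It pools the two means with the standard conjugate updating rule for a normal mean, $n'' = n' + n$ and $m'' = (n'm' + nm)/n''$. Both the pairing of the two tests and the use of this particular rule are assumptions made here for illustration, since the study's own definitions of the posterior parameters are not reproduced in this section. The function name `update_mean` is hypothetical.

```python
from statistics import mean

# "Difference in Performance" columns as tabulated, in seconds.
diffs_ot1 = [54.6, 16.2, -31.0, 113.5, 67.0, -23.2, 25.0,
             -33.2, -23.0, 2.7, 36.4, 68.8, -9.8, -17.9]
diffs_ot2 = [96.0, 115.5, 66.0, 20.5, 45.5, 83.0, 97.5, 141.5, 15.5, 34.5,
             -4.0, 117.0, 24.0, 130.0, 72.0, 113.0, 80.0, 36.5, 52.0, 103.0,
             100.5, 108.5, 97.0, 43.5, 52.5, 126.5, 31.5, 70.0, 85.0, 34.5]


def update_mean(m_prior, n_prior, m_sample, n_sample):
    """Conjugate update of a normal mean: weights are the sample sizes."""
    n_post = n_prior + n_sample
    m_post = (n_prior * m_prior + n_sample * m_sample) / n_post
    return m_post, n_post


# Assumption: OT I plays the role of the prior, OT II the new sample.
m_post, n_post = update_mean(mean(diffs_ot1), len(diffs_ot1),
                             mean(diffs_ot2), len(diffs_ot2))
print(f"posterior mean = {m_post:.2f} sec, equivalent sample size = {n_post}")
```

With weights equal to the sample sizes, the posterior mean is simply the mean of all 44 tabulated differences, so under this rule the prior counts as 14 additional observations.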